Systolic random number generator

ABSTRACT

Systems and methods for a random number generator including a systolic array to provide a random number output. In one approach, the systolic array can be arranged in two or greater dimensions, and each cell of the array comprises a ring oscillator. Data is read from a random access memory to provide the inputs to the systolic array. A linear feedback shift register receives the random number output as a feedback signal used to address the memory to read data to provide as the inputs to the systolic array.

RELATED APPLICATIONS

This is a continuation application of U.S. Non-Provisional application Ser. No. 16/459,080, filed Jul. 1, 2019, entitled “Systolic Random Number Generator,” by Richard J. Takahashi, which is a continuation application of U.S. Non-Provisional application Ser. No. 16/014,737, filed Jun. 21, 2018, entitled “Systolic Random Number Generator,” by Richard J. Takahashi, which itself is a continuation application of U.S. Non-Provisional application Ser. No. 15/450,531, filed Mar. 6, 2017, entitled “Systolic Random Number Generator,” by Richard J. Takahashi, which itself claims priority to U.S. Provisional Application Ser. No. 62/305,065, filed Mar. 8, 2016, entitled “Systolic Random Number Generator,” by Richard J. Takahashi, the contents of which applications are incorporated by reference in their entirety as if fully set forth herein.

FIELD OF THE TECHNOLOGY

At least some embodiments disclosed herein generally relate to random number generators, and more particularly, but not limited to, a random number generator using cells arranged in a systolic array configuration (e.g., a two-dimensional array of random sources in the form of cells that each include an oscillator).

BACKGROUND

A random number generator is a hardware device that generates random numbers. One application for random number generators is in cryptography, where they are used to generate random cryptographic keys, for example, to transmit data securely. These keys can be used, for example, in encryption protocols. Other exemplary applications include any application requiring a random number, such as gambling games, methods of statistical analysis, and lottery systems.

SUMMARY OF THE DESCRIPTION

Systems and methods for a random number generator are described herein. Some embodiments are summarized in this section.

In one embodiment, a random number generator includes a systolic array configured to receive a plurality of first inputs (e.g., input signals provided to a top and side of the systolic array), and to provide a random number output (e.g., for use by a host processor in cryptographic processing). The systolic array can be arranged in two or greater dimensions (e.g., a three-dimensional array).

In one embodiment, at least one memory (e.g., a static random access memory (SRAM)) is configured to provide the first inputs to the systolic array, and further configured to receive the random number output as a feedback signal (e.g., obtained from and clocked by a clock signal from an output register) used for addressing the memory to select the first inputs provided to the systolic array (e.g., the feedback signal may be provided to a shift register used to address the memory).

In one embodiment, a systolic array used in a random number generator comprises a plurality of cells, and each cell of the systolic array includes an oscillator; a first flip-flop coupled to receive a signal from the oscillator as an input and to provide a first output; an exclusive OR gate coupled to receive the first output; and a second flip-flop coupled to receive a signal from the exclusive OR gate as an input, and to provide an output to an adjacent cell in the systolic array.

In one embodiment, a random number generator includes a plurality of cells arranged in at least a two-dimensional systolic array, each cell comprising an oscillator, the systolic array to receive a plurality of first inputs in first and second sides of the array (e.g., a top side and a left side of the array), and the systolic array to provide a random number output; at least one memory is configured to provide the first inputs to the systolic array; and a shift register (e.g., a linear-feedback shift register) is configured to receive the random number output, and further configured to address the memory to select the first inputs to provide to the systolic array.

In one embodiment, each cell of a systolic array in a random number generator includes an oscillator (e.g., a free-running, ring oscillator). In one embodiment, each cell further includes a flip-flop to receive a signal from the oscillator. In one embodiment, each cell further includes an exclusive OR gate to receive a signal from the flip-flop as an input to the exclusive OR gate. In one embodiment, each cell provides an output signal to at least one adjacent cell in the array (e.g., bottom and right cells). In one embodiment, each cell further receives a signal from another cell (e.g., an adjacent top cell) as an input to the exclusive OR gate.

In one embodiment, a random number generator uses a physical unclonable function provided by a random access memory (e.g., an SRAM). Examples of the random number generator include the various embodiments of random number generators using a systolic array as described herein.

The disclosure includes methods and apparatuses which perform these methods, including computing devices and systems which perform these methods, and computer readable media containing instructions which when executed on computing devices and systems cause the devices and systems to perform these methods.

Other features will be apparent from the accompanying drawings and from the detailed description which follows.

BRIEF DESCRIPTION OF THE DRAWINGS

The embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.

FIG. 1 shows an architecture for a random number generator (RNG) using a systolic array, according to one embodiment.

FIG. 2 shows a systolic cell of a systolic array, according to one embodiment.

FIG. 3 shows a systolic array that can be used in the RNG of FIG. 1, according to one embodiment.

FIG. 4 shows a logic gate oscillator design that is programmable using different delays, for example for use in generating a clock signal fs, and/or for use in a cell of a systolic array, in one embodiment.

FIG. 5 shows a logic gate oscillator design that is programmable using different delays and a D-type flip-flop for use in a cell of a systolic array, in one embodiment.

FIG. 6 shows a digital mixer in the form of a flip-flop for use in a cell of a systolic array, according to one embodiment.

FIG. 7 shows an exemplary power spectral density.

FIG. 8 shows an ideal spectrum of power spectral densities, according to one embodiment.

DESCRIPTION

The following description and drawings are illustrative and are not to be construed as limiting. Numerous specific details are described to provide a thorough understanding. However, in certain instances, well known or conventional details are not described in order to avoid obscuring the description. References to “one embodiment” or “an embodiment” in the present disclosure are not necessarily references to the same embodiment; and, such references mean at least one.

Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not other embodiments.

Various embodiments described in this disclosure provide a true systolic random number generator (RNG) design using cells (to provide a plurality of random sources) in a systolic array configuration, as described further herein.

In one embodiment, the random number generator architecture is implemented by using one or more field-programmable gate arrays (FPGAs) or integrated circuits.

In one embodiment, each cell of the systolic array comprises at least one flip-flop, an oscillator, and an exclusive OR gate. In one embodiment, the clock signal from the oscillator provides a frequency source that is re-clocked (e.g., using a D-type flip-flop) by a non-coherent or asynchronous clock (e.g., the fs signal below), as described further below. This adds meta-stability from the flip-flop as a random source.

In one embodiment, the systolic array design used is uniform and cellular, such that this design increases or maximizes FPGA logic and minimizes layout resources. In one embodiment, the systolic RNG described herein is implemented in, for example, 8×3 arrays of cells, where each systolic cell has a free-running oscillator, D-type flip-flops, and an exclusive OR (EXOR) gate. This embodiment uses free oscillators, metastability conditions, and an SRAM (e.g., which is either powered-up with random data or pre-loaded with random data) serving as the physical unclonable function (PUF) as the sources of randomness. Hardware implementations of this design are, for example, relatively small in size, robust, high-speed and have limited single-point failure tolerance. This embodiment provides a hardware-based design using a known random source and entropy generators.

This disclosure describes the systolic RNG operation in various embodiments, including a description in some examples of how each element functions to add entropy to the RNG design. A true RNG is a desirable component of a cryptographic system.

The disclosure below also describes variations, according to differing embodiments, of optional post-processing of the output from the systolic random number generator. This post-processing may be desirable to, for example, remove possible bias, correlation, and/or any second and third-order effects from FPGA or integrated circuit fabrication corner cases and environmental factors.

FIG. 1 shows an architecture for a random number generator (RNG) using a systolic array 108, according to one embodiment. The systolic array 108 provides a random number output at output registers 114 and 116. At least one memory 106 is configured to provide inputs A, B, C, . . . K to the systolic array 108. Memory 106 provides a physical unclonable function for the random number generator, as discussed herein. The memory 106 is connected to the systolic array 108 through a permuter 123. The permutation function of permuter 123 can be programmed to connect the memory 106 outputs (A, B, C, . . . K) to the systolic array 108 inputs (A, B, C, . . . K) in any order. For example, these connections can be made in any desired order, such as A to A, B to B, A to G, B to J, C to D, etc., using the permutation function of permuter 123.
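The routing performed by permuter 123 can be pictured in software as an index mapping from memory outputs to array inputs. The following Python sketch is illustrative only: the function name `permute`, the four-signal example, and the particular mapping are assumptions made for this illustration and are not taken from the figures.

```python
def permute(memory_outputs, mapping):
    """Route memory outputs to array inputs.

    mapping[i] is the index of the memory output that drives array input i.
    (Hypothetical helper; the name and interface are assumptions.)
    """
    if sorted(mapping) != list(range(len(memory_outputs))):
        raise ValueError("mapping must be a permutation of the input indices")
    return [memory_outputs[src] for src in mapping]


# Example with four signals: array input 0 is driven by memory output C, etc.
memory_outputs = ["A", "B", "C", "D"]
array_inputs = permute(memory_outputs, [2, 0, 3, 1])
```

Because the mapping is validated as a permutation, every memory output drives exactly one array input in this sketch.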

A shift register 110 receives the random number output from output register 116, and addresses the memory to select the inputs A, B, C, . . . K to provide to the systolic array 108.

Shift register 110 receives the random number output as an input 120, and provides an output 122 used for addressing memory 106. A clock 102 provides a clock signal fs to each of a plurality of cells (see FIG. 3) in systolic array 108. The clock signal fs controls providing of the random number output (stored in output register 116) as the feedback signal to shift register 110. The clock signal fs also clocks the transfer of outputs A, B, C, . . . K from memory 106, through register 122 and the permutation function of permuter 123, to the input D-type flip-flops of systolic array 108. In one embodiment, shift register 110 is a linear-feedback shift register that provides non-linear addressing to the memory 106. As one example, shift register 110 can use a Galois configuration.
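As a rough software picture of memory addressing by a Galois-configuration LFSR, the following Python sketch steps a 16-bit register and uses its low bits as a memory address. The register width, the tap mask 0xB400 (a commonly cited maximal-length choice for 16 bits), the 8-bit address width, and all function names are assumptions for illustration; the text above does not specify them.

```python
TAPS = 0xB400  # assumed taps: x^16 + x^14 + x^13 + x^11 + 1


def galois_step(state):
    """Advance a 16-bit Galois LFSR by one clock."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= TAPS
    return state


def address_sequence(seed, count, addr_bits=8):
    """Return `count` memory addresses taken from the low bits of the LFSR."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an all-zero LFSR state never changes")
    addresses = []
    for _ in range(count):
        state = galois_step(state)
        addresses.append(state & ((1 << addr_bits) - 1))
    return addresses


def lfsr_period(seed):
    """Count the clocks until the register returns to its starting state."""
    state = galois_step(seed)
    clocks = 1
    while state != seed:
        state = galois_step(state)
        clocks += 1
    return clocks


addresses = address_sequence(0xACE1, 5)
```

In the architecture described above the register is additionally reloaded from the random number output, which this sketch does not model.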

Output registers 114 and 116 are each coupled to receive and store the random number output. Clock signal fs is provided to output register 116 to clock the providing of the feedback signal to shift register 110.

The clock signal is also provided to a register 122 coupled to memory 106. The clock signal clocks providing of the data that is read from memory 106, stored in register 122, and then provided as inputs A, B, C, . . . K to the systolic array 108. The clock signal fs also clocks the providing of an output signal from a flip-flop of each cell in a systolic array 108 to adjacent cells in the systolic array, as discussed in more detail below (see FIG. 3 and discussion below). The clock signal is further coupled to shift register 110 to control providing the address 122 to memory 106.

Exclusive OR gates 112 and 118 each receive outputs from at least two cells in the systolic array (see FIG. 3 below). The output from each exclusive OR gate provides the random number output to output registers 114 and 116. Output register 114 provides the random number output to host processor 104 via a host interface.

In one embodiment, a systolic RNG architecture consists of a systolic randomizer array, an interface for output register(s), and an SRAM physical unclonable function (PUF) with a linear-feedback shift register (LFSR) (e.g., of any type and register length of polynomials, Fibonacci or Galois or any non-linear counter design address registers), all clocked by a sampling clock fs (the sampling clock signal is illustrated in various embodiments herein, and is sometimes referred to herein as “fs”).

In one embodiment, the use of an integrated circuit static random access memory (SRAM) design and fabrication processes provides SRAM cell power-up states that are unpredictable. The systolic array 108 in one example uses an 8-by-3 cell array in the systolic RNG. In other variations, the systolic array design can use larger (or smaller) array sizes.

In this embodiment, the SRAM PUF is used to seed the RNG systolic array at power-up. The SRAM will not have any reset and will power-up in an unknown state and will be unique to each FPGA or integrated circuit.

In one embodiment, the memory 106 is an SRAM that is addressed by an LFSR (a linear feedback shift register with no reset). The LFSR also will power-up to an unknown state to address the SRAM, and the SRAM contents are then sent (as inputs A, B, C, . . . K as described above) to seed the systolic array 108.

In one example, the systolic array requires 15 fs clocks, or clocks greater than the LFSR size, prior to reading the random number output. This is required to flush the LFSR registers. The SRAM is addressed by the LFSR as discussed above. The LFSR will be loaded on each end-of-cycle count by the random number output of the systolic array (as provided by output register 116). This assures that the reading of the SRAM will be different on each end-of-address count cycle.

In one embodiment, the components of the RNG design include a systolic array, an SRAM PUF, and a host interface (e.g., the host interfaces to a central processing unit (CPU) of host processor 104). In this embodiment, these components operate with a single fs clock domain. The random sources for this RNG are based on each systolic cell (see FIG. 3) having a free-running asynchronous oscillator re-clocked by an asynchronous fs clock with a D-type flip-flop (also referred to below as a digital mixer) and EXOR'ed with inputs from adjacent cells of the systolic array.

In one embodiment, the D-type flip-flop performs the frequency mixing operation between fs and the free-running asynchronous oscillator under a variety of input frequency conditions. The D-type flip-flop output signal will alternate at the difference frequency (i.e., fs minus the frequency of the free-running asynchronous oscillator signal). The output signal is a rectangular wave whose long-term average frequency is the desired difference frequency. In addition, the output signal contains frequency jitter (a desired effect), which is a function of the relationship between the two input frequencies.

In one embodiment, the final output from the systolic array is EXOR'ed in a fail-safe configuration into, for example, two 32-bit registers 114 and 116 that host processor 104 will read. The host processor 104 reads both registers for data, and also checks for any invalid outputs (e.g., an output of all zeros or all ones).

A non-limiting example of a computing device that can be used as host processor 104 (e.g., to use the random number output in cryptographic processing) is described in U.S. Non-Provisional application Ser. No. 14/177,392, filed Feb. 11, 2014, entitled “SECURITY DEVICE WITH PROGRAMMABLE SYSTOLIC-MATRIX CRYPTOGRAPHIC MODULE AND PROGRAMMABLE INPUT/OUTPUT INTERFACE,” by Richard J. Takahashi, which is hereby incorporated by reference in its entirety.

FIG. 2 shows a systolic cell 200 as used in systolic array 108, according to one embodiment. The random number generator includes a plurality of cells arranged in at least a two-dimensional systolic array 108 (see FIG. 3). Each cell includes an oscillator 202, a flip-flop 204 coupled to receive an output signal from the oscillator 202 as an input and to provide a first output to an exclusive OR gate 206. Flip-flop 208 is coupled to receive a signal from the exclusive OR gate 206 as an input, and to provide output signals 214 and 216 to adjacent cells (not shown in FIG. 2) in the systolic array 108. The exclusive OR gate 206 is further coupled to receive output signals 210 and 212 from two adjacent cells (not shown in FIG. 2) of the systolic array.
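The data path of a single cell (oscillator, first flip-flop, exclusive OR gate, second flip-flop) can be sketched as a small clocked model. This Python sketch is a behavioral illustration under stated assumptions: the oscillator is represented only by the logic level it happens to present at each fs edge, metastability is not modeled, and the class and parameter names are hypothetical.

```python
class SystolicCell:
    """Behavioral model of one cell: FF -> 3-input XOR -> FF."""

    def __init__(self):
        self.ff1 = 0  # first flip-flop, samples the oscillator
        self.ff2 = 0  # second flip-flop, re-clocks the XOR output

    def clock(self, osc_level, in_top, in_side):
        """Apply one fs clock edge and return the cell output.

        Both flip-flops capture their inputs on the same edge, so the
        XOR uses the value held in ff1 *before* this edge.
        """
        next_ff2 = self.ff1 ^ in_top ^ in_side
        next_ff1 = osc_level
        self.ff1, self.ff2 = next_ff1, next_ff2
        return self.ff2


cell = SystolicCell()
first = cell.clock(osc_level=1, in_top=0, in_side=0)
second = cell.clock(osc_level=0, in_top=0, in_side=0)
```

In this model a sampled oscillator level reaches the cell output two fs clocks after it is presented, which reflects the two-stage register structure described above.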

Each of flip-flops 204 and 208 is clocked by clock signal fs from clock 102. In one embodiment, the frequency signal provided by oscillator 202 has a frequency less than the frequency fs of the clock signal.

In one embodiment, each systolic cell includes a free-running asynchronous oscillator whose output is sampled by a D-type flip-flop clocked by clock signal fs, and a three-input EXOR gate whose output is re-clocked by another D-type flip-flop (digital mixer). The free-running asynchronous oscillator in each cell is also a PUF in that the final frequency is unique to each FPGA or integrated circuit. Each FPGA or integrated circuit (IC) is typically not manufactured exactly the same. Each FPGA or IC is unique in its fabrication process and operates uniquely (e.g., as a function of fabrication process, voltage, and temperature across the integrated circuit), but is still within manufacturing tolerances.

In one embodiment, each cell 200 in the array receives data inputs from side and top cells (also see FIG. 3). The cell 200 is basically a digital mixer. As the data is cascaded into the array, each cell adds the random output from each adjacent cell and finally outputs to the output registers 114 and 116. The digital frequency mixing is cascaded throughout the systolic matrix as random functions (i.e., use of the free-running asynchronous oscillator, and re-clocking with fs causes flip-flop metastability).

In one embodiment, jitter that is present is a contributor to the D-type flip-flop metastability, the fs clock is asynchronous to the free-running oscillators (e.g., at a given prime number value MHz frequency), and the clock adds to the uncertainty of each systolic cell output. The digital mixer output sequence from the free-running ring frequency oscillator provides an unknown state output given that a metastable state occurs as a result of set-up or hold-time violations between the F_(freq) and the sampling clock fs. The fs clock signal that is generated and the free-running asynchronous oscillator clock signal will have clock cycle-to-cycle jitter. This jitter will break up contiguous sequences of outputs, or will delete or skip the sampled output from the free-running oscillator.

As one example, the cell oscillator frequency will be set at an 11-to-1 ratio of the fs frequency. For example, if fs is 75 MHz, the cell free-running ring frequency oscillator should be set for 825 MHz. This frequency ratio should be designed to be a prime number. In one embodiment, another approach used to provide dis-contiguous outputs by spreading the data from the free-running frequency oscillator is metastability as an isolation technique. Metastability achieves the same result, in that the output from a D-type flip-flop (digital mixer) that has its set-up time violated will produce an unpredictable output. This technique adds to the randomness when tuned to violate the set-up time of a D-type flip-flop at all times during operation. If FPGAs or integrated circuits are used to implement the RNG, this technique is technology and process dependent, and it provides further uncertainty.

FIG. 3 shows systolic array 108 as used in the RNG of FIG. 1, according to one embodiment. Systolic array 108 comprises cells 200 as discussed above. Systolic array 108 in general can have an arbitrary row by column size.

In general, as the array size increases, the randomness of the data increases. An example of an array size has dimensions of at least eight cells in each dimension.

Output signals from two or more cells 200 are provided as input signals to each of exclusive OR gates 112 and 118. The outputs from these gates are provided for storage in output registers 114 and 116 as the random number output discussed above.

In one embodiment, the systolic array consists of an 8×3 array with 24 cells and with 24 free-running ring oscillators (one oscillator 202 for each cell 200). This design can be scaled to additional ring oscillators as required. The left and top side inputs to the systolic array are provided from the SRAM PUF, as discussed above. After power-up, the SRAM's individual memory cells will power-up into an unknown state, and the SRAM data content is read into the systolic array as the initial seed values.
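The way seed values entering at the top and left sides ripple through an 8×3 array can be illustrated with a short behavioral simulation. The Python sketch below is a simplification under several assumptions: the orientation (8 rows by 3 columns), the choice to XOR the bottom row into one output bit, and all names are hypothetical, and Python's `random` module is used only as a stand-in for sampled oscillator levels (it is a pseudo-random generator, not a true random source).

```python
import random

ROWS, COLS = 8, 3  # assumed orientation of the 8x3 example


def zeros():
    """Return an all-zero ROWS x COLS grid."""
    return [[0] * COLS for _ in range(ROWS)]


def step(ff1, ff2, osc_bits, top_inputs, left_inputs):
    """Advance every cell by one fs clock; return the new (ff1, ff2) state."""
    new_ff1 = [row[:] for row in osc_bits]  # first flip-flops sample oscillators
    new_ff2 = zeros()
    for r in range(ROWS):
        for c in range(COLS):
            # Edge cells take seed inputs; interior cells take neighbor outputs.
            top = ff2[r - 1][c] if r > 0 else top_inputs[c]
            left = ff2[r][c - 1] if c > 0 else left_inputs[r]
            new_ff2[r][c] = ff1[r][c] ^ top ^ left
    return new_ff1, new_ff2


def run(clocks, seed_top, seed_left, rng):
    """Clock the array and return the XOR of the bottom-row cell outputs."""
    ff1, ff2 = zeros(), zeros()
    for _ in range(clocks):
        osc = [[rng.getrandbits(1) for _ in range(COLS)] for _ in range(ROWS)]
        ff1, ff2 = step(ff1, ff2, osc, seed_top, seed_left)
    bit = 0
    for value in ff2[ROWS - 1]:
        bit ^= value
    return bit


output_bit = run(15, [1, 0, 1], [0, 1, 1, 0, 1, 0, 0, 1], random.Random(0))
```

Because each cell registers its output, a value entering at a corner needs one clock per cell to travel across the array, which is the systolic behavior the sketch is meant to show.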

Additional details regarding random number generation using logic gates, and use of a ring oscillator in an RNG, are provided in the following technical papers, which are hereby incorporated by reference herein in their entirety:

- “High-Speed True Random Number Generation with Logic Gates Only,” Markus Dichtl and Jovan Dj. Golić.
- “Fast Digital TRNG Based on Metastable Ring Oscillator,” Ihor Vasyltsov, et al.

FIG. 4 shows a logic gate oscillator design that is programmable using different delays for use in generating the clock signal fs, and/or for use in a cell of a systolic array, in various embodiments.

In one embodiment, the systolic cell free-running ring frequency oscillator design used in each cell of the systolic array consists of inverters and NAND gates with an enable input (“Enable” as illustrated). Each oscillator leg is digitally mixed with a non-coherent fs sample clock. In this embodiment, the fs clock has, for example, at least 12 nanoseconds of cycle-to-cycle timing jitter. The fs clock also is a free-running ring frequency oscillator with jitter provided through FPGA or integrated circuit fabrication process-dependent factors such as regenerative logic threshold, thermal, and shot-flicker noise via cascading strings of gates, which all provide random contributing functions to the RNG. The higher the number of gates, the greater the jitter for the fs clock.

The systolic cell frequency asynchronous oscillators are clocked using free-running oscillator fs. FIG. 5 shows an oscillator with a D-type flip-flop for use in a cell of a systolic array, in one embodiment. FIG. 5 illustrates the free-running asynchronous oscillators and D-type flip-flop (mixer) (see digital mixer discussion herein). It should be noted that the fs asynchronous clock frequency selection is technology dependent and preferably should be in, for example, the low hundreds to millions of MHz frequency range, and the fs clock frequency should further preferably be a prime value.

It is also preferred that the input to the D-type flip-flop be a prime frequency relative to fs. In one example, fs is 79 MHz, and the D-type flip-flop input frequency can be 763 MHz (a prime number).

In one embodiment, the free-running asynchronous oscillator is designed using ring oscillators implemented in logic gates with a feedback delay (illustrated in FIG. 5 as “Delay”) to determine the frequency. In an implementation using FPGA gates, the actual frequency may not be a prime number. Instead, the frequency is a function of the actual design efforts to simulate gate delays, interconnect delay, and each gate's loading. Since the ring oscillators are designed using gate delays in a feedback configuration, each oscillator will build up jitter as more gates are used. Therefore, a lower frequency will have a larger timing jitter, and thus larger or wider power spectrums. The jitter of the ring oscillator frequency and instability is a source of randomness.
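How the gate count and feedback delay set the oscillator frequency can be estimated with the common first-order ring oscillator relation f = 1 / (2 × total loop delay), where the loop delay is the sum of the stage delays plus any added feedback delay. The Python sketch below applies that textbook approximation; the function name, the equal-delay assumption, and the example delay values are illustrative assumptions, and real FPGA or IC delays depend on interconnect and loading as noted above.

```python
def ring_oscillator_frequency_hz(stages, gate_delay_s, extra_delay_s=0.0):
    """First-order frequency estimate for a ring of inverting stages.

    Assumes an odd number of inverting stages with equal delay, plus an
    optional extra feedback delay; interconnect and loading are ignored.
    """
    if stages < 1 or stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverting stages")
    loop_delay = stages * gate_delay_s + extra_delay_s
    return 1.0 / (2.0 * loop_delay)


# Hypothetical numbers: 3 stages at 200 ps each, with and without added delay.
fast = ring_oscillator_frequency_hz(3, 200e-12)
slow = ring_oscillator_frequency_hz(3, 200e-12, extra_delay_s=400e-12)
```

The estimate shows the expected trend: adding gates or feedback delay lengthens the loop and lowers the frequency.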

FIG. 6 shows a digital mixer in the form of a flip-flop for use in cell 200 of systolic array 108, according to one embodiment. The mixing of two inputs is accomplished using this digital mixer. Although the illustrated digital mixer is similar to an input synchronizer, the digital mixer has distinct features if the two square wave inputs slowly “slip cycles” (also, see above mixer discussion).

The D-type flip-flop is used here as a mixer where the output Q provides a difference between two square waves of different frequencies provided as input signals. The D-type flip-flop is a simple form of a mixer. In other embodiments, more complex designs can be used based on the description provided herein (also, see above mixer discussion).

The output of the D-type flip-flop alternates at the difference of (fs−F_(freq)) (i.e., the absolute value of the difference), where fs is the clock signal input from clock 102 (discussed above) and F_(freq) is the D-type input to the D-type flip-flop. Conversely, the Q output is a logic zero if the input is a logic zero. However, if the two frequencies are non-coherent and slip cycles, the output will result in the fs−F_(freq) frequency difference. Also, if the F_(freq) signal is an odd integral multiple of the fs signal, the Q output generates rapidly alternating one and zero patterns.
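The difference-frequency behavior of a D-type flip-flop used as a mixer can be checked with an idealized sampling model. The Python sketch below samples a 50% duty-cycle square wave of frequency `f_osc` once per period of `fs` and counts rising edges on Q. It is a jitter-free, metastability-free idealization with hypothetical function names and arbitrary frequency units, so its output is fully deterministic; the randomness described above comes from the jitter and set-up/hold violations that this model deliberately omits.

```python
def mixer_output(f_osc, fs, n_samples):
    """Sample an ideal square wave of frequency f_osc at rate fs.

    Integer arithmetic keeps the phase exact: at sample n the oscillator
    phase is (n * f_osc mod fs) / fs of a cycle, and the wave is high
    during the first half of each cycle.
    """
    return [1 if 2 * ((n * f_osc) % fs) < fs else 0 for n in range(n_samples)]


def rising_edges(bits):
    """Count 0 -> 1 transitions in a sampled bit sequence."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == 0 and b == 1)


# One "second" of samples (fs + 1 points) with the oscillator 3 units above fs.
q = mixer_output(f_osc=103, fs=100, n_samples=101)
beat_cycles = rising_edges(q)
```

With the oscillator 3 units away from fs, Q completes 3 cycles over fs samples, matching the difference frequency in this idealized case.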

In the above embodiment, the pattern differences of the mixer are increased where F_(freq) and fs have cycle-to-cycle jitter. This uncertainty originates from an unstable frequency source. This cycle-to-cycle jitter adds to the uncertainty of the F_(freq)−fs output including the second-order effects of the D-type flip-flop metastability. The jitter is used as an isolation value to break up any sequence that may occur with a jitter-free clock.

Another feature of the digital mixer of FIG. 6 is that the output frequency has cycle-to-cycle timing phase jitter. This jitter is also related to phase jitter of the output difference frequency. The fs clock frequency determines the maximum jitter of the Q output. Since the input frequency F_(freq) is sampled by fs, the worst case output frequency has a maximum jitter of fs divided by two on a cycle-to-cycle slip. In the randomizer design (digital mixer) the greater the difference between fs and F_(freq), the greater the output frequency jitter. Given that fs is a perfect jitter source and that F_(freq) is the free-running oscillator frequency to be mixed, and that both inputs are non-coherent and will slip-cycle, the Q output appears random.

In one embodiment, a power spectral density for the digital mixer Q output exhibits a Gaussian or normal density. Since the output effectively is fs−F_(freq) plus or minus frequency deviations due to the slip cycle, the greater the jitter of the fs, the greater the frequency deviation. In a practical design, this frequency deviation has an upper bound. The output frequency will have a band of frequencies about the fs−F_(freq) center frequency.

FIG. 7 shows an exemplary power spectral density. The power spectral density of a typical digital mixer is illustrated in FIG. 7. The power spectral density of the mixer output is a Fourier transform of the average autocorrelation of the output time sequence. A single digital mixer typically does not provide a good random source. Given multiple digital mixers with a uniform frequency spread over a kHz to GHz range, the result is a wider power spectral density with individually-centered power spectral densities. The kHz to GHz range is desirable, but the low frequency harmonics will adequately cover this range. An ideal power spectrum of power spectral densities typically appears as illustrated in FIG. 8, where f1, f2, . . . fn are the center frequencies of the digital mixers with jitter-induced frequency deviations.

In one embodiment, a center frequency is chosen using a prime number as a guideline. The reason for a prime number value is the physical phenomenon of an adjacent frequency coherently coupling via capacitance between oscillators or digital mixer outputs. Prime frequency harmonics tend not to couple. Therefore, the frequency is calculated to be a prime at the output of the mixer, and not the output of the free-running oscillator. In some designs, it is difficult to maintain this prime number value, and if physical isolation can be implemented, this coupling will be reduced or eliminated as a problem. Also, in FPGA designs, it is possible to manually insert the RNG oscillators into separate rows to reduce coupling. FIG. 8 shows an ideal spectrum of power spectral densities, according to one embodiment.

Output Registers

In one embodiment, the output of the systolic RNG is connected to two 32-bit or other size output registers 114, 116, as illustrated in FIG. 1. The systolic array cell outputs are EXOR'ed by gates 112, 118 to form a complex output sequence as the results. These results are clocked out to a host processor bus to host processor 104 using, for example, a ready flag or interrupt. The output of the systolic array 108 is clocked into the 32-bit registers 116 and 114, as illustrated in FIG. 1, using clock 102 and read by the host processor 104.

In one embodiment, the host processor 104 uses 32-bit registers 116 and 114 to read the RNG. The EXOR function of gates 112, 118 is used to remove possible bias from the systolic array's output. In other embodiments, various other different EXOR configurations can be used to remove bias from the systolic RNG array.

In one embodiment, after power up of the RNG, as an example, 15 or more fs clocks will initialize the systolic RNG array. Then, the data from the RNG systolic array can be read into the two output registers 114 and 116 (each illustrated in FIG. 1). The data stored in the output register 114 or 116 in FIG. 1 is the 32-bit random number result. At this point, the randomizer can be read (e.g., read by host processor 104).

In one embodiment, EXOR'ing the outputs from the cells of the RNG systolic array improves the statistical output of random data. Varying the EXOR configurations also can be used to tune the randomizer output quality. Adding other non-linear gates coupled with this EXOR logic such as, for example, NAND gates can also improve the randomizer output.
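The bias-reducing effect of EXOR'ing cell outputs can be quantified for the idealized case of statistically independent bits. If two independent bits are one with probabilities p_a and p_b, their XOR is one with probability p_a(1 − p_b) + p_b(1 − p_a). The Python sketch below evaluates this; the independence assumption, the function names, and the example probability of 0.6 are illustrative assumptions, and correlated cell outputs would not improve as quickly.

```python
from functools import reduce


def xor_prob_one(p_a, p_b):
    """Probability that the XOR of two independent bits is 1."""
    return p_a * (1.0 - p_b) + p_b * (1.0 - p_a)


def xor_many(probabilities):
    """Probability that the XOR of several independent bits is 1."""
    return reduce(xor_prob_one, probabilities)


# Hypothetical biased sources: each bit is 1 with probability 0.6.
two_inputs = xor_many([0.6, 0.6])
four_inputs = xor_many([0.6, 0.6, 0.6, 0.6])
```

Under these assumptions the distance from an unbiased 0.5 shrinks from 0.1 for a single bit to 0.02 with two inputs and 0.0008 with four.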

Exemplary Designs

In one non-limiting example, the following guidelines are used during design of the RNG system. In this example, each oscillator is designed with an enable input. During power-up, it is desirable to keep each oscillator leg in a disabled state until VDD (e.g., the DC power to the integrated circuit or other computing device in which the RNG is formed) is stabilized. This assures that the oscillator begins in a stable mode.

In one embodiment, before accessing output data from the RNG, there is a wait of, for example, 15 or more fs clocks prior to utilizing the RNG systolic array. Each frequency oscillator is designed with a prime number delay value as a design guideline. A minimum of three inverting gates is used in the ring oscillator feedback in order to generate a high frequency and noise.

Post-Processing

In one embodiment, during the read of the RNG, host processor 104 can perform additional post-processing functions on the RNG output. Also, host processor 104 can perform statistical checks for possible RNG output failures such as output data at the output register that is, for example, all zeroes or all ones, or alternating ones and zeroes, or repeating patterns, with each situation indicating an RNG hardware failure.
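The host-side checks listed above can be sketched as a simple test on each word read from an output register. The Python sketch below flags a 32-bit word that equals itself when rotated by 1, 2, 4, 8, or 16 bits, which covers all zeroes, all ones, alternating ones and zeroes, and words built from a repeated shorter pattern. The function names and the choice of a rotation test are assumptions for illustration, not the checks actually performed by host processor 104.

```python
def rotate_left(word, shift, width=32):
    """Rotate a `width`-bit word left by `shift` bits."""
    mask = (1 << width) - 1
    word &= mask
    return ((word << shift) | (word >> (width - shift))) & mask


def is_suspect(word, width=32):
    """Return True if the word is periodic with a power-of-two period.

    Period 1 means all zeroes or all ones, period 2 means alternating
    bits, and longer periods mean a repeated nibble, byte, or half-word.
    """
    mask = (1 << width) - 1
    word &= mask
    period = 1
    while period < width:
        if rotate_left(word, period, width) == word:
            return True
        period *= 2
    return False


flags = [is_suspect(w) for w in (0x00000000, 0xAAAAAAAA, 0x12345678)]
```

A check of this kind also flags a small fraction of healthy outputs (a uniformly random 32-bit word has equal halves with probability 2^-16), so a practical design would likely act on repeated flags rather than a single one.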

Closing

In one embodiment, the random number generator above is made in a computing device using FPGAs or ASICs by programming or implementing the RNG using a high-level design language, such as VHDL or Verilog.

At least some aspects disclosed can be embodied, at least in part, in software. That is, the techniques may be carried out in a computer system or other data processing system in response to its processor(s), such as a microprocessor, executing sequences of instructions contained in a memory, such as ROM, volatile RAM, non-volatile memory, cache or a remote storage device.

In various embodiments, hardwired circuitry (e.g., one or more hardware processors or other computing devices) may be used in combination with software instructions to implement the techniques above (e.g., the communication system may be implemented using one or more computing devices). Thus, the techniques are neither limited to any specific combination of hardware circuitry and software nor to any particular source for the instructions executed by the data processing system.

In one embodiment, a computing device may be used that comprises an inter-connect (e.g., bus and system core logic), which interconnects a microprocessor(s) and a memory. The microprocessor is coupled to cache memory in one example.

The inter-connect interconnects the microprocessor(s) and the memory together and also interconnects them to a display controller and display device and to peripheral devices such as input/output (I/O) devices through an input/output controller(s). Typical I/O devices include mice, keyboards, modems, network interfaces, printers, scanners, video cameras and other devices which are well known in the art.

The inter-connect may include one or more buses connected to one another through various bridges, controllers and/or adapters. In one embodiment, the I/O controller includes a USB (Universal Serial Bus) adapter for controlling USB peripherals, and/or an IEEE-1394 bus adapter for controlling IEEE-1394 peripherals.

The memory may include ROM (Read Only Memory), and volatile RAM (Random Access Memory) and non-volatile memory, such as hard drive, flash memory, etc.

Volatile RAM is typically implemented as dynamic RAM (DRAM) which requires power continually in order to refresh or maintain the data in the memory. Non-volatile memory is typically a magnetic hard drive, a magnetic optical drive, or an optical drive (e.g., a DVD RAM), or other type of memory system which maintains data even after power is removed from the system. The non-volatile memory may also be a random access memory.

The non-volatile memory can be a local device coupled directly to the rest of the components in the data processing system. A non-volatile memory that is remote from the system, such as a network storage device coupled to the data processing system through a network interface such as a modem or Ethernet interface, can also be used.

In one embodiment, a data processing system such as the computing device above is used to implement the random number generator and/or host processor.

In one embodiment, a data processing system such as the computing device above is used to implement a user terminal, which may provide a user interface for control of a computing device. For example, a user interface may permit configuration of the encryption gateway. A user terminal may be in the form of a personal digital assistant (PDA), a cellular phone or other mobile device, a notebook computer or a personal desktop computer.

In some embodiments, one or more servers of the data processing system can be replaced with the service of a peer to peer network of a plurality of data processing systems, or a network of distributed computing systems. The peer to peer network, or a distributed computing system, can be collectively viewed as a server data processing system.

Embodiments of the disclosure can be implemented via the microprocessor(s) and/or the memory above. For example, the functionalities described can be partially implemented via hardware logic in the microprocessor(s) and partially using the instructions stored in the memory. Some embodiments are implemented using the microprocessor(s) without additional instructions stored in the memory. Some embodiments are implemented using the instructions stored in the memory for execution by one or more general purpose microprocessor(s). Thus, the disclosure is not limited to a specific configuration of hardware and/or software.

In this description, various functions and operations may be described as being performed by or caused by software code to simplify description. However, those skilled in the art will recognize what is meant by such expressions is that the functions result from execution of the code by a processor, such as a microprocessor.

Alternatively, or in combination, the functions and operations can be implemented using special purpose circuitry, with or without software instructions, such as using an Application-Specific Integrated Circuit (ASIC) or a Field-Programmable Gate Array (FPGA). Embodiments can be implemented using hardwired circuitry without software instructions, or in combination with software instructions. Thus, the techniques are limited neither to any specific combination of hardware circuitry and software, nor to any particular source for the instructions executed by the data processing system.

Hardware and/or software may be used to implement the embodiments above. The software may be a sequence of instructions referred to as “computer programs.” The computer programs typically comprise one or more instructions set at various times in various memory and storage devices in a computer, and that, when read and executed by one or more processors in a computer, cause the computer to perform operations necessary to execute elements involving the various aspects.

Software used in an embodiment may be stored in a machine readable medium. The executable software, when executed by a data processing system, causes the system to perform various methods. The executable software and data may be stored in various places including for example ROM, volatile RAM, non-volatile memory and/or cache. Portions of this software and/or data may be stored in any one of these storage devices. Further, the data and instructions can be obtained from centralized servers or peer to peer networks. Different portions of the data and instructions can be obtained from different centralized servers and/or peer to peer networks at different times and in different communication sessions or in a same communication session. The data and instructions can be obtained in entirety prior to the execution of the applications. Alternatively, portions of the data and instructions can be obtained dynamically, just in time, when needed for execution. Thus, it is not required that the data and instructions be on a machine readable medium in entirety at a particular instance of time.

Examples of computer-readable media include but are not limited to recordable and non-recordable type media such as volatile and non-volatile memory devices, read only memory (ROM), random access memory (RAM), flash memory devices, floppy and other removable disks, magnetic disk storage media, optical storage media (e.g., Compact Disk Read-Only Memory (CD ROMS), Digital Versatile Disks (DVDs), etc.), among others. The computer-readable media may store the instructions.

In general, a tangible machine readable medium includes any mechanism that provides (e.g., stores) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant, manufacturing tool, any device with a set of one or more processors, etc.).

Benefits, other advantages, and solutions to problems have been described herein with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any elements that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as critical, required, or essential features or elements of the disclosure.

No claim element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph, unless the element is expressly recited using the phrase “means for.”

In the foregoing specification, the disclosure has been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

1-20. (canceled)
21. A random number generator comprising: a plurality of cells configured in an at least two-dimensional array, each cell comprising an oscillator, wherein the at least two-dimensional array is arranged to provide a random number output; a physical unclonable function configured to provide input to the at least two-dimensional array; and a shift register configured to receive the random number output from the systolic array and to provide a feedback signal to the physical unclonable function for selecting the input.
22. The random number generator of claim 21, wherein the oscillator in each cell is a ring oscillator.
23. The random number generator of claim 21, wherein the physical unclonable function comprises a memory device.
24. The random number generator of claim 23, wherein the feedback signal is used to address the memory device.
25. The random number generator of claim 21, wherein the physical unclonable function, as directed by the shift register, is used to seed the oscillators in the systolic array.
26. The random number generator of claim 21, wherein each cell further comprises a flip-flop configured to receive an output from the oscillator and to provide an input to an exclusive OR gate.
27. The random number generator of claim 26, wherein the exclusive OR gate is further coupled to receive inputs from two adjacent cells in the at least two-dimensional array.
28. The random number generator of claim 26, wherein the flip-flop is a first flip-flop, and wherein each cell further comprises a second flip-flop coupled to receive an input from the exclusive OR gate and to provide an output to an adjacent cell in the at least two-dimensional array.
29. The random number generator of claim 21, further comprising a clock configured to provide a signal to control providing of the feedback signal to the physical unclonable function.
30. The random number generator of claim 21, wherein the shift register comprises a linear feedback shift register.
31. A system comprising: a plurality of cryptographic modules each configured to process packets, wherein the packet processing includes encrypting a packet based on a security key associated with the packet; and a systolic array configured to receive a plurality of inputs, and to provide a random number output to one or more of the plurality of cryptographic modules.
32. The system of claim 31, further comprising: memory configured to provide the inputs to the systolic array, and further coupled to receive a signal based on the random number output, wherein the signal is used for selecting the inputs provided to the systolic array.
33. The system of claim 31, further comprising: a first interface coupled to the plurality of cryptographic modules and configured to receive an incoming packet, associate a first security key with the incoming packet, select one of the plurality of cryptographic modules, and route the incoming packet to the selected cryptographic module.
34. The system of claim 31, wherein the systolic array comprises a plurality of cells, each cell including an oscillator.
35. The system of claim 31, further comprising a shift register configured to: receive the random number output; and provide an output used for addressing the memory to select the inputs provided to the systolic array.
36. A system comprising: a random number generator comprising: a systolic array configured to provide a random number value, wherein the systolic array comprises a plurality of cells, and each cell comprises an oscillator; a plurality of programmable cryptographic devices, each cryptographic device configured to receive the random number value; and at least one programmable input/output interface configured to route each of a plurality of incoming packets to one of the cryptographic devices for encryption.
37. The system of claim 36, wherein: the programmable input/output interface is programmable to support different interface protocols, and each of the plurality of cryptographic devices is programmable to support different encryption protocols.
38. The system of claim 36, wherein each of the plurality of programmable cryptographic devices comprises at least one of a programmable systolic packet input engine, a programmable systolic cryptographic engine, or a programmable systolic packet output engine.
39. The system of claim 36, wherein the oscillator is a free-running asynchronous oscillator.
40. The system of claim 39, wherein each cell further comprises a flip-flop that receives an input signal from the oscillator.